Abstract. We investigate the approximation for computing the sum
$a_1 + \cdots + a_n$ with an input of a list of nonnegative elements $a_1, \ldots, a_n$.
If all elements are in the range $[0, 1]$, there is a randomized algorithm
that can compute a $(1+\epsilon)$-approximation for the sum problem in time
$O\left(\frac{n \log\log n}{\sum_{i=1}^n a_i}\right)$, where $\epsilon$ is a constant in $(0, 1)$. Our randomized algorithm
is based on uniform random sampling, which selects one element
with equal probability from the input list each time. We also prove a
lower bound $\Omega\left(\frac{n}{\sum_{i=1}^n a_i}\right)$, which almost matches the upper bound, for
this problem.

1 Introduction

Computing the sum of a list of elements has many applications. The problem
appears in high school textbooks, and calculus textbooks often show how to
compute the sum of a list of elements and decide whether it converges when
the number of items is infinite. Let $\epsilon$ be a real number at least 0. A real number
$s$ is a $(1+\epsilon)$-approximation for the sum problem $a_1, a_2, \ldots, a_n$ if
$\frac{\sum_{i=1}^n a_i}{1+\epsilon} \le s \le (1+\epsilon)\sum_{i=1}^n a_i$. When we have a huge number of data items and need to
compute their sum, an efficient approximation algorithm becomes essential. Due
to the fundamental importance of this problem, looking for a sublinear time
solution for it is an interesting topic of research.
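
The approximation criterion can be illustrated with a small check. This is only a sketch, assuming the two-sided definition $\frac{\sum a_i}{1+\epsilon} \le s \le (1+\epsilon)\sum a_i$; the function name `is_approximation` is ours and does not appear in the paper.

```python
def is_approximation(s, values, eps):
    """Return True if s is a (1 + eps)-approximation of sum(values).

    Assumes the two-sided definition: total / (1 + eps) <= s <= (1 + eps) * total.
    """
    total = sum(values)
    return total / (1 + eps) <= s <= (1 + eps) * total


# Example: the exact sum is 10, so with eps = 0.1 the accepted range is
# roughly [9.09, 11].
print(is_approximation(10.5, [2, 3, 5], 0.1))
print(is_approximation(12.0, [2, 3, 5], 0.1))
```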

A similar problem is to compute the mean of a list of items $a_1, a_2, \ldots, a_n$,
whose mean is defined by $\frac{a_1 + a_2 + \cdots + a_n}{n}$. Using $O(\frac{1}{\epsilon^2}\log\frac{1}{\delta})$ random samples, one
can compute the $(1+\epsilon)$-approximation for the mean, or decide if it is at
most $\delta$ [5]. In [3], Canetti, Even, and Goldreich showed that the sample size
is tight. In [6], Motwani, Panigrahy, and Xu showed an $O(\sqrt{n})$ time
approximation scheme for computing the sum of $n$ nonnegative elements. A priority
sampling approach for estimating subset sums was studied in [1,4,2]. Using
different cost and application models, they tried to build a sketch so that the sum of
any subset can be computed approximately via the sketch.

We feel that uniform sampling is more justifiable than weighted sampling.
In this paper, we study the approximation for the sum problem under both
the deterministic model and the randomized model. In the randomized model, we still
use uniform random sampling, and show how the time depends inversely
on the total sum $\sum_{i=1}^n a_i$. We also prove a lower bound that almost matches this time
bound. An algorithm of time complexity $O\left(\frac{n \log\log n}{\sum_{i=1}^n a_i}\right)$ for computing the sum of a list of
nonnegative elements $a_1, \ldots, a_n$ in $[0,1]$ can be extended to a general list of
nonnegative elements. It implies an algorithm of time complexity $O\left(\frac{M n \log\log n}{\sum_{i=1}^n a_i}\right)$
for computing the sum of a list of nonnegative elements of size at most $M$ by converting
each $a_i$ into $\frac{a_i}{M}$, which is always in the range $[0,1]$.



2 Randomized Algorithm for the Sum Problem 



In this section, we present a randomized algorithm for computing the approxi- 
mate sum of a list of numbers in [0,1]. 



2.1 Chernoff Bounds 



The analysis of our randomized algorithm often uses the well-known Chernoff
bounds, which are described below. All proofs in this paper are self-contained
except for the following famous theorems in probability theory.

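Theorem 1 ([7]). Let $X_1, \ldots, X_n$ be $n$ independent random 0-1 variables, where
$X_i$ takes 1 with probability $p_i$. Let $X = \sum_{i=1}^n X_i$, and $\mu = E[X]$. Then for any
$\theta > 0$,

1. $\Pr(X < (1-\theta)\mu) < e^{-\frac{1}{2}\mu\theta^2}$, and

2. $\Pr(X > (1+\theta)\mu) < \left[\frac{e^{\theta}}{(1+\theta)^{(1+\theta)}}\right]^{\mu}$.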

We follow the proof of Theorem 1 to derive the following versions (Theorem 2
and Theorem 3) of the Chernoff bound for our algorithm analysis.

Theorem 2. Let $X_1, \ldots, X_n$ be $n$ independent random 0-1 variables, where $X_i$
takes 1 with probability at least $p$ for $i = 1, \ldots, n$. Let $X = \sum_{i=1}^n X_i$, and
$\mu = E[X]$. Then for any $\theta > 0$, $\Pr(X < (1-\theta)pn) < e^{-\frac{1}{2}\theta^2 pn}$.

Theorem 3. Let $X_1, \ldots, X_n$ be $n$ independent random 0-1 variables, where $X_i$
takes 1 with probability at most $p$ for $i = 1, \ldots, n$. Let $X = \sum_{i=1}^n X_i$. Then for
any $\theta > 0$, $\Pr(X > (1+\theta)pn) < \left[\frac{e^{\theta}}{(1+\theta)^{(1+\theta)}}\right]^{pn}$.

Define $g_1(\theta) = e^{-\frac{1}{2}\theta^2}$ and $g_2(\theta) = \frac{e^{\theta}}{(1+\theta)^{(1+\theta)}}$. Define $g(\theta) = \max(g_1(\theta), g_2(\theta))$.
We note that $g_1(\theta)$ and $g_2(\theta)$ are always strictly less than 1 for all $\theta > 0$. This is
trivial for $g_1(\theta)$. For $g_2(\theta)$, it can be verified by checking that the function
$f(x) = x - (1+x)\ln(1+x)$, which equals $\ln g_2(x)$, is decreasing and $f(0) = 0$. This is because $f'(x) =
-\ln(1+x)$, which is strictly less than 0 for all $x > 0$. Thus, $g_2(\theta)$ is also decreasing,
and less than 1 for all $\theta > 0$.
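
The three functions can be evaluated numerically as a sanity check. The sketch below assumes the definitions $g_1(\theta) = e^{-\theta^2/2}$ and $g_2(\theta) = e^{\theta}/(1+\theta)^{(1+\theta)}$; the Python names `g1`, `g2`, and `g` simply mirror the symbols in the text.

```python
import math


def g1(theta):
    """Lower-tail Chernoff base: exp(-theta^2 / 2)."""
    return math.exp(-theta ** 2 / 2)


def g2(theta):
    """Upper-tail Chernoff base: e^theta / (1 + theta)^(1 + theta)."""
    return math.exp(theta) / (1 + theta) ** (1 + theta)


def g(theta):
    """Pointwise maximum of the two bases."""
    return max(g1(theta), g2(theta))


# Both bases stay strictly below 1 for positive theta.
for theta in (0.1, 0.5, 1.0, 5.0):
    print(theta, g1(theta), g2(theta), g(theta))
```

Because both bases are below 1, raising them to a larger exponent (more samples) only makes the failure bound smaller, which is how the functions are used in the lemmas of Section 2.2.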



2.2 A Sublinear Time Algorithm 

In this section, we show an algorithm to compute the approximate sum in
sublinear time in the cases that $\sum_{i=1}^n a_i$ is at least $(\log\log n)^{1+\epsilon}$ for any constant
$\epsilon > 0$. This is a randomized algorithm with uniform random sampling.

Theorem 4. Let $\epsilon$ be a positive constant in $(0,1)$. There is a sublinear time
algorithm such that given a list of items $a_1, a_2, \ldots, a_n$ in $[0, 1]$, it gives a
$(1+\epsilon)$-approximation in time $O\left(\frac{n \log\log n}{\sum_{i=1}^n a_i}\right)$.

Definition 1.

- For each interval $I$ and a list of items $L$, define $A(I, L)$ to be the number of
items of $L$ in $I$.

- For $\delta$ and $\gamma$ in $(0, 1)$, a $(\delta, \gamma)$-partition for $[0, 1]$ divides the interval $[0, 1]$
into intervals $I_1 = [\pi_1, \pi_0], I_2 = [\pi_2, \pi_1), I_3 = [\pi_3, \pi_2), \ldots, I_k = [0, \pi_{k-1})$
such that $\pi_0 = 1$, $\pi_i = \pi_{i-1}(1-\delta)$ for $i = 1, 2, \ldots, k-1$, and $\pi_{k-1}$ is the
first element with $\pi_{k-1} \le \frac{\gamma}{n^2}$.

- For a set $A$, $|A|$ is the number of elements in $A$. For a list $L$ of items, $|L|$
is the number of items in $L$.
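
A partition of this kind is easy to build explicitly. The following is a minimal sketch, assuming the stopping threshold is $\gamma/n^2$ (that constant is garbled in the extracted text and is inferred from the later bound $n \cdot \frac{\delta}{n^2} = \frac{\delta}{n}$ on the items of $I_k$). The helper names `partition_points` and `interval_index` are hypothetical.

```python
def partition_points(delta, gamma, n):
    """Return [pi_0, ..., pi_{k-1}] with pi_0 = 1 and pi_i = pi_{i-1} * (1 - delta).

    Generation stops at the first point that is <= gamma / n**2
    (assumed threshold), so the list has k entries.
    """
    points = [1.0]
    threshold = gamma / n ** 2
    while points[-1] > threshold:
        points.append(points[-1] * (1 - delta))
    return points


def interval_index(x, points):
    """Return the 1-based index j of the interval I_j that contains x.

    I_1 = [pi_1, pi_0], I_j = [pi_j, pi_{j-1}) for 1 < j < k, I_k = [0, pi_{k-1}).
    """
    k = len(points)
    for j in range(1, k):
        if x >= points[j]:
            return j
    return k


points = partition_points(0.5, 0.5, 2)   # threshold 0.125
print(points)                            # four endpoints, so k = 4
print(interval_index(0.3, points))       # 0.3 lies in I_2 = [0.25, 0.5)
```

The linear scan in `interval_index` is chosen for readability; since the endpoints are sorted, a binary search would locate the interval in $O(\log k)$ time.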

A brief description of the idea is presented before the formal algorithm and its
proof. In order to get a $(1+\epsilon)$-approximation for the sum of $n$ input numbers
in the list $L$, a parameter $\delta$ is selected with $1 - \frac{\epsilon}{2} \le (1-\delta)^3$. For a $(\delta,\delta)$-partition
$I_1 \cup I_2 \cup \cdots \cup I_k$ for $[0, 1]$, Algorithm Approximate-Sum(.) below gives the
estimation for the number of items in each $I_j$ if interval $I_j$ has a sufficient number
of items. Otherwise, those items in $I_j$ can be ignored without affecting much of
the approximation ratio. We have an adaptive way to do random sampling in a
series of phases. Let $s_t$ denote the number of random samples in phase $t$. Phase
$t+1$ doubles the number of random samples of phase $t$ ($s_{t+1} = 2s_t$). Let $L$ be
the input list of items in the range $[0, 1]$. Let $d_j$ be the number of items in $I_j$ among
the samples. For each phase, if an interval $I_j$ shows a sufficient number of items
among the random samples, the number of items $A(I_j, L)$ in $I_j$ can be sufficiently
approximated by $\hat{A}(I_j, L) = d_j \cdot \frac{n}{s_t}$. Thus, $\hat{A}(I_j, L)\pi_j$ also gives an approximation
for the sum of the sizes of items in $I_j$. The sum apx_sum $= \sum_j \hat{A}(I_j, L)\pi_j$ over
those intervals $I_j$ with a large number of samples gives an approximation for the
total sum $\sum_{i=1}^n a_i$ of the input list. In the early stages, apx_sum is much smaller
than $\frac{n}{s_t}$. Eventually, apx_sum will surpass $\frac{n}{s_t}$. This happens when $s_t$ is more than
$\frac{n}{\sum_{i=1}^n a_i}$ and apx_sum is close to the sum $\sum_{i=1}^n a_i$ of all items from the input list.
This indicates that the number of random samples is sufficient for the approximation
algorithm. For those intervals with a small number of samples, their items only
form a small fraction of the total sum. The process is terminated when ignoring
all those intervals with no samples or a small number of samples does not affect much of
the accuracy of approximation. The algorithm gives up the process of random
sampling when $s_t$ surpasses $n$, and switches to a deterministic way to access
the input list, which happens when the total sum of the sizes of input items is
$O(1)$.

The computation time at each phase $i$ is $O(s_i)$. If phase $t$ is the last phase,
the total time is $O(s_t + \frac{s_t}{2} + \frac{s_t}{2^2} + \cdots) = O(s_t)$, which is close to $O\left(\frac{n}{\sum_{i=1}^n a_i}\right)$. Our
final complexity upper bound is $O\left(\frac{n \log\log n}{\sum_{i=1}^n a_i}\right)$, where the $\log\log n$ factor is caused
by the probability amplification over $O(\log n)$ stages and $O(\log n)$ intervals of the
$(\delta, \delta)$-partition in the randomized algorithm.

Algorithm Approximate-Sum($\epsilon$, $\alpha$, $n$, $L$)

Input: a small parameter $\epsilon \in (0, 1)$, a failure probability upper
bound $\alpha$, an integer $n$, and a list $L$ of $n$ items $a_1, \ldots, a_n$ in $[0, 1]$.
Steps:

1. Phase 0:

2. Select $\delta = \frac{\epsilon}{6}$, which satisfies $1 - \frac{\epsilon}{2} \le (1-\delta)^3$.

3. Let $P$ be a $(\delta, \delta)$-partition $I_1 \cup I_2 \cup \cdots \cup I_k$ for $[0, 1]$.

4. Let $\xi_0$ be a parameter such that $8(k+1)(\log n)\, g(\delta)^{(\xi_0 \log\log n)/2} < \alpha$ for
all large $n$.

5. Let $z := \xi_0 \log\log n$.

6. Let parameters $c_1 := \frac{\delta^2}{2(1+\delta)}$, and $c_2 := \frac{\xi_0}{(1-\delta)c_1}$.

7. Let $s_0 := z$.

8. End of Phase 0.

9. Phase $t$:

10. Let $s_t := 2s_{t-1}$.

11. Sample $s_t$ random items $a_{i_1}, \ldots, a_{i_{s_t}}$ from the input list $L$.

12. Let $d_j := |\{h : a_{i_h} \in I_j \text{ and } 1 \le h \le s_t\}|$ for $j = 1, 2, \ldots, k$.

13. For each $I_j$,

14. if $d_j \ge z$,

15. then let $\hat{A}(I_j, L) := \frac{n}{s_t} d_j$ to approximate $A(I_j, L)$.

16. else let $\hat{A}(I_j, L) := 0$.

17. Let apx_sum $:= \sum_{d_j \ge z} \hat{A}(I_j, L)\pi_j$ to approximate $\sum_{i=1}^n a_i$.

18. If apx_sum $< \frac{2c_2 n \log\log n}{s_t}$ and $s_t < n$ then enter Phase $t+1$.

19. else

20. If $s_t < n$

21. then let apx_sum $:= \sum_{d_j \ge z} \hat{A}(I_j, L)\pi_j$ to approximate $\sum_{1 \le i \le n} a_i$.

22. else let apx_sum $:= \sum_{i=1}^n a_i$.

23. Output apx_sum and terminate the algorithm.

24. End of Phase $t$.

End of Algorithm
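
The phases of the algorithm can be sketched in Python as follows. This is an illustration under several assumptions and not the authors' implementation: `xi0` is passed in by the caller (the paper only requires it to satisfy the condition of line 4 for all large $n$, and the default value here is arbitrary); the constants $c_1 = \frac{\delta^2}{2(1+\delta)}$ and $c_2 = \frac{\xi_0}{(1-\delta)c_1}$ and the partition threshold $\frac{\delta}{n^2}$ are reconstructed from garbled text; items in the last interval $I_k$ are ignored; and when $s_t \ge n$ the sketch skips the sampling and returns the exact sum directly, which yields the same output as lines 20 to 22.

```python
import bisect
import math
import random


def approximate_sum(values, eps, xi0=8.0, rng=None):
    """Sketch of Approximate-Sum for values in [0, 1] (assumptions in the text)."""
    rng = rng or random.Random()
    n = len(values)
    if n < 3:                       # log log n is not positive for tiny n
        return float(sum(values))

    # Phase 0: parameters.
    delta = eps / 6
    points = [1.0]                  # pi_0, pi_1, ..., pi_{k-1}
    while points[-1] > delta / n ** 2:
        points.append(points[-1] * (1 - delta))
    k = len(points)
    ascending = points[::-1]        # sorted copy for binary search
    loglog = math.log(math.log(n))
    z = xi0 * loglog
    c1 = delta ** 2 / (2 * (1 + delta))
    c2 = xi0 / ((1 - delta) * c1)
    s = max(1, math.ceil(z))        # s_0

    while True:
        s *= 2                      # s_t = 2 * s_{t-1}
        if s >= n:
            # Sampling no longer pays off: read the whole list.
            return float(sum(values))
        counts = [0] * (k + 1)      # counts[j] = d_j for j = 1..k
        for _ in range(s):
            x = values[rng.randrange(n)]
            r = bisect.bisect_right(ascending, x)   # number of pi_i <= x
            j = k if r == 0 else max(1, k - r)
            counts[j] += 1
        # Intervals with fewer than z samples (and I_k) contribute 0.
        apx = sum(counts[j] * n / s * points[j]
                  for j in range(1, k) if counts[j] >= z)
        if apx >= 2 * c2 * n * loglog / s:
            return apx


# Usage: every item is 0.5, so the true sum is 50000.
estimate = approximate_sum([0.5] * 100000, eps=0.5, rng=random.Random(1))
print(estimate)
```

With these constants $c_2$ is large for small $\epsilon$, so sampling only beats a full scan when $n$ is large and the total sum is large as well.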

Several lemmas will be proved in order to show the performance of the
algorithm. Let $\delta$, $\xi_0$, $c_1$, and $c_2$ be parameters defined as those in Phase 0 of the
algorithm Approximate-Sum(.).

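Lemma 1.

1. For parameter $\delta$ in $(0, 1)$, a $(\delta, \delta)$-partition for $[0, 1]$ has the number of
intervals $k = O\left(\frac{\log n + \log\frac{1}{\delta}}{\delta}\right)$.

2. $g(x) \le e^{-\frac{x^2}{4}}$ when $0 < x \le \frac{1}{2}$.

3. The parameter $\xi_0$ can be set to be $O\left(\frac{\log\frac{1}{\alpha\delta}}{\log\frac{1}{g(\delta)}}\right) = O\left(\frac{\log\frac{1}{\alpha\delta}}{\delta^2}\right)$ for line 4 in the
algorithm Approximate-Sum(.).

4. Function $g(x)$ is decreasing and $g(x) < 1$ for every $x > 0$.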

Proof. Statement 1: The number of intervals $k$ is the least integer with $(1-\delta)^k \le
\frac{\delta}{n^2}$. We have $k = O\left(\frac{\log n + \log\frac{1}{\delta}}{\delta}\right)$.

Statement 2: By definition $g(x) = \max(g_1(x), g_2(x))$, where $g_1(x) = e^{-\frac{1}{2}x^2}$
and $g_2(x) = \frac{e^x}{(1+x)^{(1+x)}}$. We just need to prove that $g_2(x) \le e^{-\frac{x^2}{4}}$ when $x \le \frac{1}{2}$.
By Taylor's theorem, $\ln(1+x) \ge x - \frac{x^2}{2}$. Assume $0 < x \le \frac{1}{2}$. We have

$$\begin{aligned}
\ln g_2(x) &= x - (1+x)\ln(1+x) \\
&\le x - (1+x)\left(x - \frac{x^2}{2}\right) \\
&= -\frac{x^2}{2}(1 - x) \\
&\le -\frac{x^2}{4}.
\end{aligned}$$

Statement 3: We need to set up $\xi_0$ to satisfy the condition in line 4 of
the algorithm. It follows from statement 1 and statement 2.

Statement 4: It follows from the fact that $g_2(x)$ is decreasing, and less than
1 for each $x > 0$, as already explained in Section 2.1.

We use uniform random sampling to approximate the number of items in
each interval $I_j$ in the $(\delta, \delta)$-partition. For technical reasons, we estimate
the failure probability instead of the success probability.

Lemma 2. Let $Q_1$ be the probability that the following statement is false at the
end of each phase:

(i) For each interval $I_j$ with $d_j \ge z$, $(1-\delta)A(I_j, L) \le \hat{A}(I_j, L) \le (1+\delta)A(I_j, L)$.

Then for each phase in the algorithm, $Q_1 \le (k+1) \cdot g(\delta)^{\frac{z}{2}}$.



Proof. An element of $L$ in $I_j$ is sampled (by a uniform sampling) with
probability $p_j = \frac{A(I_j, L)}{n}$. Let $p' = \frac{z}{2s_t}$. For each interval $I_j$ with $d_j \ge z$, we discuss
two cases.

- Case 1. $p' > p_j$.

In this case, $d_j \ge z = 2p's_t > 2p_js_t$. Note that $d_j$ is the number of
elements in interval $I_j$ among $s_t$ random samples $a_{i_1}, \ldots, a_{i_{s_t}}$ from $L$. By
Theorem 3 (with $\theta = 1$ and probability bound $p'$), with probability at most $P_1 = g_2(1)^{p's_t} =
g_2(1)^{z/2} \le g(1)^{z/2}$, there are at least $2p's_t$ samples from interval $I_j$.
Thus, the probability is at most $P_1$ for the condition of Case 1 to be true.

- Case 2. $p' \le p_j$.

By Theorem 3, we have $\Pr[d_j > (1+\delta)p_js_t] \le g_2(\delta)^{p_js_t} \le g_2(\delta)^{p's_t} =
g_2(\delta)^{\frac{z}{2}} \le g(\delta)^{\frac{z}{2}}$.

By Theorem 2, we have $\Pr[d_j < (1-\delta)p_js_t] \le g_1(\delta)^{p_js_t} \le g_1(\delta)^{p's_t} =
g_1(\delta)^{\frac{z}{2}} \le g(\delta)^{\frac{z}{2}}$.

For each interval $I_j$ with $d_j \ge z$ and $(1-\delta)p_js_t \le d_j \le (1+\delta)p_js_t$, we
have $(1-\delta)A(I_j, L) \le \hat{A}(I_j, L) \le (1+\delta)A(I_j, L)$ by line 15 in
Approximate-Sum(.).

There are $k$ intervals $I_1, \ldots, I_k$. Therefore, with probability at most $P_2 = k \cdot
g(\delta)^{\frac{z}{2}}$, the following is false: For each interval $I_j$ with $d_j \ge z$, $(1-\delta)A(I_j, L) \le
\hat{A}(I_j, L) \le (1+\delta)A(I_j, L)$.

By the analysis of Case 1 and Case 2, we have $Q_1 \le P_1 + P_2 \le (k+1) \cdot g(\delta)^{\frac{z}{2}}$
(see statement 4 of Lemma 1). Thus, the lemma has been proven.
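
The tail bounds used in this proof can be compared against exact binomial tails for a single interval. The sketch below assumes every sample falls in the interval independently with the same probability $p$, so that $d_j$ is binomial; the function names are ours, and the bounds are the ones stated in Theorems 2 and 3.

```python
import math


def binomial_pmf(n, p, i):
    """Exact probability that a Binomial(n, p) variable equals i."""
    return math.comb(n, i) * p ** i * (1 - p) ** (n - i)


def upper_tail(n, p, threshold):
    """Exact Pr[X > threshold] for X ~ Binomial(n, p)."""
    start = math.floor(threshold) + 1
    return sum(binomial_pmf(n, p, i) for i in range(start, n + 1))


def lower_tail(n, p, threshold):
    """Exact Pr[X < threshold] for X ~ Binomial(n, p)."""
    end = math.ceil(threshold)      # X < threshold  <=>  X <= ceil(threshold) - 1
    return sum(binomial_pmf(n, p, i) for i in range(0, end))


def chernoff_upper(theta, pn):
    """Bound of Theorem 3: [e^theta / (1 + theta)^(1 + theta)]^(pn)."""
    return (math.exp(theta) / (1 + theta) ** (1 + theta)) ** pn


def chernoff_lower(theta, pn):
    """Bound of Theorem 2: exp(-theta^2 * pn / 2)."""
    return math.exp(-theta ** 2 * pn / 2)


# 200 samples, an interval hit with probability 0.1, deviation theta = 0.5.
n, p, theta = 200, 0.1, 0.5
print(upper_tail(n, p, (1 + theta) * p * n), chernoff_upper(theta, p * n))
print(lower_tail(n, p, (1 - theta) * p * n), chernoff_lower(theta, p * n))
```

In each printed pair the exact tail comes first and the Chernoff bound second; the bound is looser, which is the price paid for a closed form that can be summed over all $k$ intervals and all phases.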

Lemma 3. Assume that $s_t \ge \frac{c_2 n \log\log n}{\sum_{i=1}^n a_i}$. Then right after executing Phase $t$ in
Approximate-Sum(.), with probability at most $Q_2 = 2k\, g(\delta)^{\xi_0 \log\log n}$, the
following statement is false:

(ii) For each interval $I_j$ with $A(I_j, L) \ge c_1 \sum_{i=1}^n a_i$, A). $(1-\delta)A(I_j, L) \le
\hat{A}(I_j, L) \le (1+\delta)A(I_j, L)$; and B). $d_j \ge z$.

Proof. Assume that $s_t \ge \frac{c_2 n \log\log n}{\sum_{i=1}^n a_i}$. Consider each interval $I_j$ with $A(I_j, L) \ge
c_1 \sum_{i=1}^n a_i$. We have that $p_j = \frac{A(I_j, L)}{n} \ge \frac{c_1 \sum_{i=1}^n a_i}{n}$. An element of $L$ in $I_j$
is sampled with probability $p_j$. By Theorem 3, Theorem 2, and Phase 0 of
Approximate-Sum(.), we have

$\Pr[d_j < (1-\delta)p_js_t] \le g_1(\delta)^{p_js_t} \le g_1(\delta)^{c_1c_2 \log\log n} \le g(\delta)^{\xi_0 \log\log n}$, (1)
$\Pr[d_j > (1+\delta)p_js_t] \le g_2(\delta)^{p_js_t} \le g_2(\delta)^{c_1c_2 \log\log n} \le g(\delta)^{\xi_0 \log\log n}$. (2)

Therefore, with probability at most $2k\, g(\delta)^{\xi_0 \log\log n}$, the following statement
is false:

For each interval $I_j$ with $A(I_j, L) \ge c_1 \sum_{i=1}^n a_i$, $(1-\delta)A(I_j, L) \le \hat{A}(I_j, L) \le
(1+\delta)A(I_j, L)$.

If $d_j \ge (1-\delta)p_js_t$, then we have

$$\begin{aligned}
d_j &\ge (1-\delta)\frac{A(I_j, L)}{n}\, s_t \\
&\ge (1-\delta)\left(\frac{c_1 \sum_{i=1}^n a_i}{n}\right) \cdot \frac{c_2 n \log\log n}{\sum_{i=1}^n a_i} \\
&= (1-\delta)c_1c_2 \log\log n \\
&\ge \xi_0 \log\log n = z. \quad \text{(by Phase 0 of Approximate-Sum(.))}
\end{aligned}$$

Lemma 4. The total sum of the sizes of items in those $I_j$ with $A(I_j, L) <
c_1 \sum_{i=1}^n a_i$ is at most $\frac{\delta}{2}\left(\sum_{i=1}^n a_i\right) + \frac{\delta}{n}$.



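Proof. By Definition 1, we have $\pi_j = (1-\delta)^j$ for $j = 1, \ldots, k-1$. We have that

- the sum of sizes of items in $I_k$ is at most $n \cdot \frac{\delta}{n^2} = \frac{\delta}{n}$;

- for each interval $I_j$ with $A(I_j, L) < c_1 \sum_{i=1}^n a_i$, the sum of sizes of items in
$I_j$ is at most $(c_1 \sum_{i=1}^n a_i)\pi_{j-1} \le (c_1 \sum_{i=1}^n a_i)(1-\delta)^{j-1}$ for $j \in [1, k-1]$.

The total sum of the sizes of items in those $I_j$ with $A(I_j, L) < c_1 \sum_{i=1}^n a_i$ is at
most

$$\begin{aligned}
\sum_{j=1}^{k-1}\left(c_1 \sum_{i=1}^n a_i\, \pi_{j-1}\right) + \sum_{a_i \in I_k} a_i
&\le \sum_{j=1}^{k-1}\left(c_1 \sum_{i=1}^n a_i\right)(1-\delta)^{j-1} + n \cdot \frac{\delta}{n^2} \\
&\le \frac{c_1}{\delta}\left(\sum_{i=1}^n a_i\right) + \frac{\delta}{n} \\
&\le \frac{\delta}{2}\left(\sum_{i=1}^n a_i\right) + \frac{\delta}{n}. \quad \text{(by Phase 0 of Approximate-Sum(.))}
\end{aligned}$$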

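Lemma 5. Assume that at the end of phase $t$, for each $I_j$ with $\hat{A}(I_j, L) >
0$, $A(I_j, L)(1-\delta) \le \hat{A}(I_j, L) \le A(I_j, L)(1+\delta)$; and $d_j \ge z$ if $A(I_j, L) \ge
c_1 \sum_{i=1}^n a_i$. Then $(1-\frac{\epsilon}{2})\left(\sum_{i=1}^n a_i - \frac{4\delta}{n}\right) \le$ apx_sum $\le (1+\delta)\left(\sum_{i=1}^n a_i\right)$ at the
end of phase $t$.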

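Proof. By the assumption of the lemma, we have apx_sum $= \sum_{d_j \ge z} \hat{A}(I_j, L)\pi_j \le
(1+\delta)\sum_{i=1}^n a_i$. For each interval $I_j$ with $j \ne k$, we have $A(I_j, L)\pi_j \ge (1-\delta)\sum_{a_i \in I_j} a_i$
by the definition of the $(\delta, \delta)$-partition. Thus,

$A(I_j, L)\pi_j \ge (1-\delta)\sum_{a_i \in I_j} a_i$ for $j \ne k$. (3)

By the condition of this lemma and Lemma 4, we have

$\sum_{d_j < z}\sum_{a_i \in I_j} a_i \le \frac{\delta}{2}\left(\sum_{i=1}^n a_i\right) + \frac{\delta}{n}$. (4)

We have the following inequalities:

$$\begin{aligned}
\mathrm{apx\_sum} &= \sum_{d_j \ge z} \hat{A}(I_j, L)\pi_j \quad \text{(by line 17 in Approximate-Sum(.))} \\
&\ge (1-\delta)\sum_{d_j \ge z} A(I_j, L)\pi_j \\
&\ge (1-\delta)^2 \sum_{d_j \ge z}\sum_{a_i \in I_j} a_i \quad \text{(by inequality (3))} \\
&\ge (1-\delta)^2\left(\sum_{i=1}^n a_i - \sum_{d_j < z}\sum_{a_i \in I_j} a_i\right) \\
&\ge (1-\delta)^2\left(\sum_{i=1}^n a_i - \left(\frac{\delta}{2}\left(\sum_{i=1}^n a_i\right) + \frac{\delta}{n}\right)\right) \quad \text{(by inequality (4))} \\
&\ge (1-\delta)^3\left(\sum_{i=1}^n a_i - \frac{4\delta}{n}\right) \\
&\ge \left(1-\frac{\epsilon}{2}\right)\left(\sum_{i=1}^n a_i - \frac{4\delta}{n}\right). \quad \text{(by line 2 in Phase 0 of the algorithm)}
\end{aligned}$$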

n 



Lemma 6. With probability at most $Q_3 = (k+1) \cdot (\log n)\, g(\delta)^{\frac{z}{2}}$, at least one of
the following statements is false:

A. For each phase $t$ with $s_t < \frac{c_2 n \log\log n}{\sum_{i=1}^n a_i}$, the condition apx_sum $< \frac{2c_2 n \log\log n}{s_t}$
in line 18 of the algorithm is true.

B. If $\sum_{i=1}^n a_i \ge 4$, then the algorithm stops at some phase $t$ with $s_t \le \frac{16c_2 n \log\log n}{\sum_{i=1}^n a_i}$.

C. If $\sum_{i=1}^n a_i < 4$, then it stops at a phase $t$ in which the condition $s_t \ge n$ first
becomes true, and outputs apx_sum $= \sum_{i=1}^n a_i$.

Proof. By Lemma 2, with probability at most $(k+1) \cdot g(\delta)^{\frac{z}{2}}$, statement (i) of
Lemma 2 is false for a fixed phase. The number of phases is at most $\log n$ since $s_t$ is
doubled at each phase. With probability at most $(k+1) \cdot (\log n) \cdot g(\delta)^{\frac{z}{2}}$, statement (i) of
Lemma 2 is false for some phase $t$ with $s_t < n$. Assume that statement (i) of Lemma
2 is true for every phase $t$ executed by the algorithm Approximate-Sum(.).

Statement A. Assume that $s_t < \frac{c_2 n \log\log n}{\sum_{i=1}^n a_i}$. Therefore, $\sum_{i=1}^n a_i < \left(\frac{n}{s_t}\right)c_2 \log\log n =
\frac{c_2 n \log\log n}{s_t}$.
Since statement (i) of Lemma 2 is true, the condition of Lemma 5 is
satisfied. By Lemma 5, apx_sum $\le (1+\delta)\sum_{i=1}^n a_i$. Since $(1+\delta) \le 2$ (by line 2 in
Approximate-Sum(.)), we have

apx_sum $\le (1+\delta)\sum_{i=1}^n a_i \le 2\sum_{i=1}^n a_i < 2 \cdot \frac{c_2 n \log\log n}{s_t} = \frac{2c_2 n \log\log n}{s_t}$.

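Statement B. The variable $s_t$ is doubled in each new phase.
Assume that the algorithm enters phase $t$ with $\frac{8c_2 n \log\log n}{\sum_{i=1}^n a_i} \le s_t \le \frac{16c_2 n \log\log n}{\sum_{i=1}^n a_i}$.
We have

$\frac{n}{s_t} \le \frac{n \sum_{i=1}^n a_i}{8c_2 n \log\log n} = \frac{\sum_{i=1}^n a_i}{8c_2 \log\log n}$. (5)

Since $\sum_{i=1}^n a_i \ge 4$, $\left(\sum_{i=1}^n a_i - \frac{4\delta}{n}\right) \ge (1-\delta)\left(\sum_{i=1}^n a_i\right)$.
By Lemma 5, we have the inequality

apx_sum $\ge (1-\frac{\epsilon}{2})(1-\delta)\left(\sum_{i=1}^n a_i\right)$. (6)

By the setting at Phase 0 of the algorithm, we have

$(1-\frac{\epsilon}{2})(1-\delta) \ge \frac{1}{2} \cdot \frac{3}{4} = \frac{3}{8}$. (7)

We have

$$\begin{aligned}
\mathrm{apx\_sum} &\ge \left(1-\frac{\epsilon}{2}\right)(1-\delta)\left(\sum_{i=1}^n a_i\right) \quad \text{(by inequality (6))} \quad (8) \\
&\ge \left(1-\frac{\epsilon}{2}\right)(1-\delta)\left(\frac{n}{s_t} \cdot 8c_2 \log\log n\right) \quad \text{(by inequality (5))} \quad (9) \\
&\ge \frac{3}{8} \cdot \frac{n}{s_t} \cdot 8c_2 \log\log n \quad \text{(by inequality (7))} \quad (10) \\
&= \frac{3c_2 n \log\log n}{s_t}. \quad (11)
\end{aligned}$$

Thus, the condition at line 18 in Approximate-Sum(.) is false, and
the algorithm stops at some stage $t$ with $s_t \le \frac{16c_2 n \log\log n}{\sum_{i=1}^n a_i}$ by the setting at
line 18 in Approximate-Sum(.).

Statement C. It follows from statement A and the setting in line 18 of the
algorithm.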

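Lemma 7. The complexity of the algorithm is $O\left(\frac{\log\frac{1}{\alpha\delta}}{\delta^4} \min\left(\frac{n}{\sum_{i=1}^n a_i}, n\right) \log\log n\right)$.

In particular, the complexity is $O\left(\min\left(\frac{n}{\sum_{i=1}^n a_i}, n\right) \log\log n\right)$ if $\alpha$ is fixed in $(0,1)$.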

Proof. We check the size $s_t$ of random sampling according to statement B and
statement C of Lemma 6 to determine when to stop the algorithm. We have
$\xi_0 = O\left(\frac{\log\frac{1}{\alpha\delta}}{\delta^2}\right)$ by Lemma 1. By the setting in line 6 in Approximate-Sum(.), we
have $c_2 = \frac{\xi_0}{(1-\delta)c_1} = O\left(\frac{\log\frac{1}{\alpha\delta}}{\delta^4}\right)$.
Since $s_i$ is doubled every phase and each phase $i$ costs $O(s_i)$ time, the total
time of the algorithm is $O(s_1 + s_2 + \cdots + s_t) = O(s_t)$, where phase $t$ is the last
phase.

The computational time complexity of the algorithm follows from
statement B and statement C of Lemma 6.
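
The claim that the doubling schedule costs $O(s_t)$ in total is the usual geometric-series argument. A small sketch follows; the function name `phase_costs` and the initial size 16 are arbitrary illustrative choices.

```python
def phase_costs(s0, phases):
    """Return the sample sizes s_1..s_t with s_t = 2 * s_{t-1}, and their total."""
    sizes = [s0 * 2 ** t for t in range(1, phases + 1)]
    return sizes, sum(sizes)


sizes, total = phase_costs(16, 10)
# The total over all phases is less than twice the size of the last phase.
print(sizes[-1], total, total < 2 * sizes[-1])
```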

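Lemma 8. With probability at most $\alpha$, at least one of the following statements
is false after executing the algorithm Approximate-Sum($\epsilon$, $\alpha$, $n$, $L$):

1. If $\sum_{i=1}^n a_i \ge 4$, then $(1-\epsilon)\left(\sum_{i=1}^n a_i\right) \le$ apx_sum $\le (1+\frac{\epsilon}{2})\left(\sum_{i=1}^n a_i\right)$;

2. If $\sum_{i=1}^n a_i < 4$, then apx_sum $= \sum_{i=1}^n a_i$; and

3. It runs in $O\left(\frac{\log\frac{1}{\alpha\delta}}{\delta^4} \min\left(\frac{n}{\sum_{i=1}^n a_i}, n\right) \log\log n\right)$ time. In particular, the
complexity of the algorithm is $O\left(\min\left(\frac{n}{\sum_{i=1}^n a_i}, n\right) \log\log n\right)$ if $\alpha$ is fixed in $(0,1)$.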

Proof. As $s_t$ is doubled each new phase in Approximate-Sum(.), the number
of phases is at most $\log n$. With probability at most $(\log n)(Q_1 + Q_2) + Q_3 \le \alpha$
(by line 4 in Approximate-Sum(.)), at least one of the statements (i) in
Lemma 2, (ii) in Lemma 3, A, B, C in Lemma 6 is false.

Assume that the statements (i) in Lemma 2, (ii) in Lemma 3, A, B, and C
in Lemma 6 are all true.

Statement 1: The condition of Statement 1 implies $n \ge 4$. By Lemma 5, we
have

$(1-\frac{\epsilon}{2})\left(\sum_{i=1}^n a_i - \frac{4\delta}{n}\right) \le$ apx_sum $\le (1+\delta)\left(\sum_{i=1}^n a_i\right)$. (12)

Since $\sum_{i=1}^n a_i \ge 4$, we have

$\left(\sum_{i=1}^n a_i - \frac{4\delta}{n}\right) \ge (1-\delta)\left(\sum_{i=1}^n a_i\right)$. (13)

We have the inequality

$$\begin{aligned}
\mathrm{apx\_sum} &\ge \left(1-\frac{\epsilon}{2}\right)(1-\delta)\left(\sum_{i=1}^n a_i\right) \quad \text{(by inequalities (13) and (12))} \quad (14) \\
&\ge (1-\epsilon)\left(\sum_{i=1}^n a_i\right). \quad \text{(by Phase 0 in Approximate-Sum(.))} \quad (15)
\end{aligned}$$

Statement 2 follows from Statement C of Lemma 6.
Statement 3 for the running time follows from Lemma 7.
Thus, with probability at most $\alpha$, at least one of the statements 1 to 3 is
false.

Now we have the proof for our main theorem.
Proof (for Theorem 4). Let $\alpha$ be a fixed constant in $(0,1)$ (for example, $\alpha = \frac{1}{4}$) and $\epsilon \in (0, 1)$. The theorem follows from Lemma 8 via a
proper setting for those parameters in the algorithm Approximate-Sum(.).

The $(\delta, \delta)$-partition $P$: $I_1 \cup I_2 \cup \cdots \cup I_k$ for $[0, 1]$ can be generated in $O\left(\frac{\log n + \log\frac{1}{\delta}}{\delta}\right)$
time by Lemma 1. Let $L$ be a list of $n$ numbers in $[0, 1]$. Pass $\delta$, $\alpha$, $P$, $n$, and $L$
to Approximate-Sum(.), which returns an approximate sum apx_sum.

By statement 1 and statement 2 of Lemma 8, we have a $(1+\epsilon)$-approximation
for the sum problem with failure probability at most $\alpha$. The computational time
is bounded by $O\left(\frac{\log\frac{1}{\alpha\delta}}{\delta^4} \min\left(\frac{n}{\sum_{i=1}^n a_i}, n\right) \log\log n\right)$ by statement 3 of Lemma 8.

Definition 2. Let $f(n)$ be a function from $N$ to $(0, n]$ and let $c > 1$ be a parameter.
Define $\Sigma(c, f(n))$ to be the class of sum problems with an input of nonnegative
numbers $a_1, \ldots, a_n$ with $\sum_{i=1}^n a_i \in \left[\frac{f(n)}{c}, c f(n)\right]$.

Corollary 1. Assume that $f(n)$ is a function from $N$ to $(0, n]$ and $c$ is a given
constant greater than 1. There is an $O\left(\frac{n \log\log n}{f(n)}\right)$ time algorithm such that given
a list of nonnegative numbers $a_1, a_2, \ldots, a_n$ in $\Sigma(c, f(n))$, it gives a
$(1+\epsilon)$-approximation.

Proof. It follows from Theorem 4.

We can extend our sublinear time algorithm to the more general list of non- 
negative elements. 

Theorem 5. Assume that $\epsilon$ is a positive constant in $(0,1)$. Then there is an
$O\left(\frac{M n \log\log n}{\sum_{i=1}^n a_i}\right)$ time algorithm to compute a $(1+\epsilon)$-approximation for a list of
nonnegative numbers $a_1, \ldots, a_n$ in the range $[0, M]$.

Proof. A list of nonnegative elements $a_1, \ldots, a_n$ can be converted into the list
$\frac{a_1}{M}, \ldots, \frac{a_n}{M}$ in $[0, 1]$. It follows from Theorem 4.
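
The reduction in this proof is a plain rescaling. The sketch below assumes some estimator for lists in $[0, 1]$ is available and passes it in as the hypothetical parameter `unit_estimator`; in the usage example Python's exact `sum` stands in for it, so the example only illustrates the scaling and not the sampling.

```python
def approximate_sum_bounded(values, M, unit_estimator):
    """Estimate the sum of values in [0, M] via an estimator for lists in [0, 1].

    unit_estimator is any callable that maps a list of numbers in [0, 1]
    to an estimate of their sum (a stand-in for the sublinear algorithm).
    """
    if M <= 0:
        raise ValueError("M must be positive")
    if any(a < 0 or a > M for a in values):
        raise ValueError("all values must lie in [0, M]")
    scaled = [a / M for a in values]          # each a_i / M lies in [0, 1]
    return M * unit_estimator(scaled)         # undo the scaling


print(approximate_sum_bounded([3.0, 0.0, 7.5, 10.0], 10.0, sum))
```

Multiplying an estimate by $M$ preserves the approximation ratio, while the scaled list has total sum $\frac{1}{M}\sum a_i$, which is where the extra factor $M$ in the running time comes from.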

3 Lower Bound 

We show a lower bound for those sum problems with bounded sum of sizes
$\sum_{i=1}^n a_i$. The lower bound almost matches the upper bound.

Theorem 6. Assume $f(n)$ is a nondecreasing unbounded function from $N$ to
$N$ with $f(n) = o(n)$. Every randomized $(\sqrt{c} - \epsilon)$-approximation algorithm for
the sum problem in $\Sigma(c, f(n))$ needs $\Omega\left(\frac{n}{f(n)}\right)$ time, where $c$ is a constant greater
than 1, and $\epsilon$ is an arbitrarily small constant in $(0, \sqrt{c} - 1)$.

Proof. The first list $L_1$ contains $f(n)$ elements of size $\frac{1}{c}$, and its remaining $n - f(n)$
items are 0. The sum of numbers in the first list is $\frac{f(n)}{c}$. Therefore, the first list
is a sum problem in $\Sigma(c, f(n))$.

The second list $L_2$ contains $f(n)$ elements of value 1, and its remaining $n - f(n)$
items are 0. The sum of numbers in the second list is $f(n)$. Therefore, the second
list is a sum problem in $\Sigma(c, f(n))$.
Assume that an algorithm only has computational time $o\left(\frac{n}{f(n)}\right)$ for computing a
$k$-approximation for sum problems in $\Sigma(c, f(n))$ with $k = \sqrt{c} - \epsilon$. For each
uniform random sampling, with probability $\frac{f(n)}{n}$ it gets a number greater than 0
in each $L_i$. The algorithm has an $o(1)$ probability to access at least one item
greater than 0 in each list in a path of computation. Therefore, $L_1$ and $L_2$ have
the same output for approximation by the same randomized algorithm. If $s$ is a
$k$-approximation for both sum problems, we have

$\frac{f(n)}{ck} \le s \le \frac{k f(n)}{c}$, and (16)

$\frac{f(n)}{k} \le s \le k f(n)$. (17)

We have $\frac{k f(n)}{c} < \frac{f(n)}{k}$ for $k = \sqrt{c} - \epsilon$. This brings a contradiction.
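
The two hard instances and the final contradiction can be made concrete with small numbers. This is only an illustration with fixed values for $n$, $f(n)$, and $c$ (the theorem itself is asymptotic), and the helper names are hypothetical.

```python
def hard_instances(n, f, c):
    """Build L1 (f items of size 1/c) and L2 (f items of size 1), padded with zeros."""
    first = [1.0 / c] * f + [0.0] * (n - f)
    second = [1.0] * f + [0.0] * (n - f)
    return first, second


def prob_all_zero(n, f, m):
    """Probability that m uniform samples (with replacement) all hit zero items."""
    return (1 - f / n) ** m


def ranges_overlap(f, c, k):
    """Check whether one value s can be a k-approximation for both lists."""
    low1, high1 = f / (c * k), k * f / c      # inequality (16)
    low2, high2 = f / k, k * f                # inequality (17)
    return max(low1, low2) <= min(high1, high2)


first, second = hard_instances(1000, 10, 4.0)
print(sum(first), sum(second))
# With few samples both lists look like all-zero lists most of the time.
print(prob_all_zero(1000, 10, 5))
# For k below sqrt(c) = 2 the two acceptable ranges are disjoint.
print(ranges_overlap(10, 4.0, 4.0 ** 0.5 - 0.1))
```

Until a nonzero item is sampled the two lists are indistinguishable, and roughly $\frac{n}{f(n)}$ samples are needed before that becomes likely.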

Corollary 2. There is no $o\left(\frac{n}{\sum_{i=1}^n a_i}\right)$ time randomized approximation scheme
algorithm for the sum problem.



4 Conclusions 



We studied the approximate sum in a few models. We show that the approximate
sum can be computed in time $O\left(\frac{n \log\log n}{\sum_{i=1}^n a_i}\right)$ if the input list is in the range $[0, 1]$.
Our lower bound almost matches the upper bound. An interesting theoretical
problem is to close the small gap between the lower bound and upper bound for
the approximate sum problem.